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Abstract 

We develop a Bayesian nonparametric framework for modeling ordinal regression relation¬ 
ships which evolve in discrete time. The motivating application involves a key problem 
in fisheries research on estimating dynamically evolving relationships between age, length 
and maturity, the latter recorded on an ordinal scale. The methodology builds from non¬ 
parametric mixture modeling for the joint stochastic mechanism of covariates and latent 
continuous responses. This approach yields highly flexible inference for ordinal regression 
functions while at the same time avoiding the computational challenges of parametric 
models. A novel dependent Dirichlet process prior for time-dependent mixing distribu¬ 
tions extends the model to the dynamic setting. The methodology is used for a detailed 
study of relationships between maturity, age, and length for Chilipepper rockfish, using 
data collected over 15 years along the coast of California. 
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1 Introduction 


Consider ordinal responses collected along with covariates over discrete time. Furthermore, 
assnme multiple observations are recorded at each point in time. This article develops 
Bayesian nonparametric modeling and inference for a discrete time series of ordinal regression 
relationships. Our aim is to provide flexible inference for the series of regression functions, 
estimating the unique relationships present at each time, while introducing dependence by 
assuming each distribution is correlated with its predecessors. 

Environmental characteristics consisting of ordered categorical and continuous measure¬ 
ments may be monitored and recorded at different points in time, requiring a model for the 
temporal relationships between the environmental variables. The relationships present at a 
particular point in time are of interest, as well as any trends or changes which occur over 
time. Empirical distributions in environmental settings may exhibit non-standard features 
including heavy tails, skewness, and multimodality. To capture these features, one must move 
beyond standard parametric models in order to obtain more flexible inference and prediction. 

The motivating application for this work lies in modeling fish maturity as a function of age 
and length. This is a key problem in fisheries science, one reason being that estimates of age at 


maturity play an important role in population estimates of sustainable harvest rates (Clark 


1991; Hannah et ah, 2009). The specific data set comes from the National Marine Fisheries 


Service and consists of year of sampling, age recorded in years, length in millimeters, and 
maturity for female Chilipepper rockfish, with measurements collected over 15 years along the 
coast of California. Maturity is recorded on an ordinal scale, with values taken to be from 1 
through 3, where 1 indicates immature and 2 and 3 represent pre- and post-spawning mature, 
respectively. More details on the data are provided in Sectionj^ Exploratory analysis suggests 
both symmetric, unimodal as well as less standard shapes for the marginal distributions of 
length and age; histograms for three years are shown in Figure Bivariate data plots of 
age and length suggest similar shapes across years, with some differences in location and 
scale, and clear differences in sample size, as can be seen from Figure To make the plot 


2 








t=1994 


t=2000 


t=1998 



250 350 450 550 

length (mm) 



250 300 350 400 450 500 
length (mm) 



250 300 350 400 450 500 
length (mm) 




5 10 15 20 


age (years) 


age (years) 


age (years) 


Figure 1: Data histograms of length and age for female Chilipepper rockfish in years 1994, 
1998, and 2000. 

more readable, random noise has been added to age, which is recorded on a discretized scale. 
Maturity level is also indicated; red color represents immature, green pre-spawning mature, 
and blue post-spawning mature. Again, there are similarities including the concentration of 
immature fish near the lower left quadrants, but also differences such as the lack of immature 
fish in years 1995 through 2000 as compared to the early and later years. 

In addition to studying maturity as a function of age and length, inference for the age 
and length distributions is also important. This requires a joint model which treats age 
and length as random in addition to maturity. We are not aware of any existing modeling 
strategy for this problem which can handle multivariate mixed data collected over time. 
Compromising this important aspect of the problem, and assuming the regression of maturity 
on body characteristics is the sole inferential objective, a possible approach would be to use an 
ordered probit regression model. Empirical (data-based) estimates for the trend in maturity 
as a function of length or age indicate shapes which may not be captured well by a parametric 
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Figure 2: Bivariate plots of length versus age at each year of data, with data points colored 
according to maturity level. Red represents level 1, green level 2, and blue level 3. Values of 
age have been jittered to make the plots more readable. 


model. For instance, the probability a fish is immature (level 1) is generally decreasing with 
length, however, in some of the years, the probability a fish is post-spawning mature (level 3) 
is increasing up to a certain length value and then decreasing. This is not a trend that can be 
captured by parametric models for ordinal regression ( jBoes and Winkelmann 2006, discuss 
some of these properties). One could include higher order and/or interaction terms, though 
it is not obvious which terms to include, and how to capture the different trends across years. 

In practice, virtually all methods for studying maturity as a function of age and/or length 
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use logistic regression or some variant, often collapsing maturity into two levels (immature 
and mature) and treating each covariate separately in the analysis (e.g., Hannah et al.[ 2009[ 


Bobko and Berkeley, 2004). Bobko and Berkeley (2004) applied logistic regression with 


length as a covariate, and to obtain an estimate of age at 50% maturity (the age at which 
50% of fish are mature), they used their estimate for length at 50% maturity and solved 
for the corresponding age given by the von Bertalanffy growth curve, which relates age to 
length using a particular parametric function. Others assume that maturity is independent 
of length after conditioning on age, leading to inaccurate estimates of the proportion mature 


at a particular age or length (Morgan and Hoenig, 1997). 

We develop a nonparametric Bayesian model to study time-evolving relationships between 
maturation, length, and age. These three variables constitute a random vector, and although 
maturity is recorded on an ordinal scale, it is natural to conceptualize an underlying continu¬ 
ous maturation variable. Distinguishing features of our approach include the joint modeling 
for the stochastic mechanism of maturation and length and age, and the ability to obtain flex¬ 
ible time-dependent inference for multiple ordinal maturation categories. While estimating 
maturity as a function of length and age is of primary interest, the joint modeling framework 
provides inference for a variety of functionals involving the three body characteristics. 

Our goal is to construct a modeling framework for dynamic ordinal regression which 
avoids strict parametric assumptions and possesses features that make it well-suited to the 
fish maturity application, as well as to similar evolutionary biology problems on studying 
natural selection characteristics (such as survival or maturation) in terms of phenotypic traits. 


To this end, we build on previous work on ordinal regression not involving time (DeYoreo and 


Kottas, 2014a), where the ordinal responses arise from latent continuous variables, and the 


joint latent response-covariate distribution is modeled using a Dirichlet process (DP) mixture 
of multivariate normals (Muller et al., [1996 ). In the context of the rockfish data, we model 
maturity, length, and age jointly, using a DP mixture. This modeling approach is further 
developed here to handle ordinal regressions which are indexed in discrete time, through 


use of a new dependent Dirichlet process (DDP) prior (MacEachern, 1999 2000), which 
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estimates the regression relationship at each time point in a flexible way, while incorporating 
dependence across time. 


We review the model for ordinal regression without the time component in Section 2.1 


Section 2.2 introduces the DDP, and in Section 2.3 we develop a new method for incorpo¬ 
rating dependence in the DP weights to handle distributions indexed in discrete time. The 
model is then developed further in Sectionin the context of the motivating application, and 
applied to analyze the rockfish data discussed above. Section concludes with a discussion. 
Technical details on properties of the DDP prior model, and on the posterior simulation 
method are included in the appendixes. 


2 Modeling Framework 


2.1 Bayesian Nonparametric Ordinal Regression 


We first describe our approach to regression in the context of a single distribution, that is, 
without any aspect of time. Let {(y*, Xi) : i = 1,..., n} denote the data, where each obser¬ 
vation consists of an ordinal response y* along with a vector of covariates Xi = {xn ,..., Xip). 


The methodology is developed in DeYoreo and Kottas (2014a) for multivariate ordinal re¬ 
sponses, however we work with a univariate response for notational simplicity and because 
this is the relevant setting for the application of interest. Our model assumes the ordinal 
responses arise as discretized versions of latent continuous responses, which is natural for 
many settings and particularly relevant for the fish maturity application, as maturation is a 
continuous variable recorded on a discrete scale. With C categories, introduce latent contin¬ 
uous responses {Zi ,..., Zn) such that = y if and only if Zj G ( 7 j_i, yy], for j = 1 ,..., C, 
and cut-offs —oo = 70 < 71 < • • • < 7 c = oo. 


We focus on settings in which the covariates may be treated as random, which is appro¬ 
priate, indeed necessary, for many environmental applications. In the fish maturity example, 
the body characteristics are interrelated and arise in the form of a data vector, and we 
are interested in various relationships, including but not limited to the way in which ma- 
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turity varies with age and length. This motivates our focus on building a flexible model 
for the joint density f{z,x), for which we apply a DP mixture of multivariate normals: 


{zi,Xi) I G ~/ N(-;/x,S)dG(/2,5]), with G | a,Go ~ DP(a,Go). 


By the constructive definition of the DP (Sethuraman, 1994), a realization G from a 


DP(a, Go) is almost surely of the form G = Ylt^iPi^Or The locations 0; = (/ 2 j,S;) are 
independent realizations from the centering distribution Go, and the weights are deter¬ 
mined through stick-breaking from beta distributed random variables. In particular, let 
vi beta(l, a), I = 1,2,, independently of {Oi}, and define pi = vi, and for / = 2, 3,..., 
Pi = vi nt=i(l “ '^r)- Therefore, the model for f{z, x) has an almost sure representation as a 
countable mixture of multivariate normals, and implies the following induced model for the 
regression function: 


Pr(y = j\ x-,G) = ^Wr{ 


r=l 


pj 

X) I l>i(z;mr{x), Sr) dz 
7j-i 


( 1 ) 


with covariate-dependent weights Wr{x) oc prN(x; covariate-dependent means mr{x) 

pI -|- and variances Sr = Here, Pr is partitioned 

into Pr Pr according to Z and X, and 5]^^) are the components of the 

corresponding partition of covariance matrix 1]^. 

This modeling strategy allows for non-linear, non-standard relationships to be captured, 
and overcomes many limitations of standard parametric models. In addition, the cut-offs 
may be fixed to arbitrary increasing values (which we recommend to be equally spaced and 
centered at zero) without sacrihcing the ability of the model to approximate any distribution 
for mixed ordinal-continuous data. In particular, it can be shown that the induced prior 
model on the space of mixed ordinal-continuous distributions assigns positive probability to 
all Kullback-Leibler neighborhoods of any distribution in this space. This represents a key 


computational advantage over parametric models. We refer to DeYoreo and Kottas (2014a) 


for more details on model properties, a review of existing approaches to ordinal regression, 
and illustration of the benefits afforded by the nonparametric joint model over standard 
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methods. This discussion refers to ordinal responses with three or more categories. For 
the case of binary regression, i.e., when (7 = 2, additional restrictions are needed on the 


covariance matrix S to facilitate identifiability (DeYoreo and Kottas 2014b). 


2.2 Dependent Dirichlet Processes 

In developing a model for a collection of distributions indexed in discrete time, we seek to 
build on previous knowledge, retaining the powerful and well-studied DP mixture model 
marginally at each time t £ T, with T = {1,2,...}. We thus seek to extend the DP prior 
to model Gj- = {Gt : t G T}, a set of dependent distributions such that each Gt follows 
a DP marginally. The dynamic DP extension can be developed by introducing temporal 
dependence in the weights and/or atoms of the constructive definition, G = Yl’^iPi^Or 
The general formulation of the DDP introduced by MacEachern ( 1999[ 


expresses 


the atoms Ois = {Oi^t '■ t £ •S'}, / = 1, 2,... as independently and identically distributed (i.i.d.) 
sample paths from a stochastic process over S', and the latent beta random variables which 
drive the weights, vis = {u;y : t G Sj, I = 1,2,, as i.i.d. realizations from a stochastic 
process with beta(l,at) marginal distributions. The distributions could be indexed in time, 
space, or by covariates, and S represents the corresponding index set, being Z+ in our case. 
The DDP model for distributions indexed in discrete time expresses Gt as t > 

t gT. The locations Oij- = {Oi^t '■ t G T} are i.i.d. for Z = 1, 2,..., from a time series model 
for the kernel parameters. The stick-breaking weights Pij- = {pi^t : t G T}, 1 = 1,2,..., arise 
through a latent time series with beta(l,Q!t) marginal distributions, independently of Oip. 

The general DDP can be simplified by introducing dependence only in the weights, such 
that the atoms are not time dependent, or alternatively, the atoms can be time dependent 
while the weights remain independent of time. We refer to these as common atoms and 
common weights models, respectively. While the majority of DDP applications fall into 
the common weights category, we believe the most natural of the two simplifications for time 
series is to assume that the locations are constant over time, and introduce dependence in the 
weights. Although we will end up working with the more general version of the DDP with 















dependent atoms and weights, for the application and related settings, it seems plausible 
that there is a fixed set of factors that determine the region in which the joint density 
of body characteristics is supported, but dynamics are caused by changes in the relative 
importance of the factors. The construction of dependent weights requires dependent beta 
random variables, so that pi^t = Pi,t = — Vr,t), for / = 2, 3,..., with each 

{vi^t ■ t £ T} a realization from a time series model with beta(l, a) marginals. Equivalently, 
we can write = 1 - pi^t = (1 - A,i) nt=i for / = 2, 3,..., with each {Pi^t -t gT} 
a realization from a time series model with beta(Q!, 1) marginals. 

There have been many variations of the DDP model proposed in the literature. The com¬ 


mon weights version was originally discussed by MacEachern] (2000), in which a Gaussian 
process was used to generate dependent locations, with the autocorrelation function control¬ 
ling the degree to which distributions which are “close” are similar, and how quickly this 


similarity decays. De lorio et al. (2004) consider also a common weights model, in which the 
index of dependence is a covariate, a key application of DDP models. In the order-based DDP 


of Griffin and Steel (2006), covariates are used to sort the weights. Covariate dependence is 
incorporated in the weights in the kernel and probit stick-breaking models of|Dunson and| 


Park (2008) and Rodriguez and Dunson (2011), respectively, however these prior models do 


not retain the DP marginally. Gelfand et al. (2005) developed a DP mixture model for spatial 
data, using a spatial Gaussian process to induce dependence in DDP locations indexed by 


space. For data indexed in discrete time, Rodriguez and ter Horst (2008) apply a common 


weights model, with atoms arising from a dynamic linear model. Di Lucca et al. (2013) 
develop a model for a single time series of continuous or binary responses through a DDP 


in which the atoms are dependent on lagged terms. Xiao et al. (2015) construct a dynamic 
model for Poisson process intensities built from a DDP mixture with common weights and dif¬ 


ferent types of autoregressive processes for the atoms. Taddy (2010) assumes the alternative 
simplification of the DDP with common atoms, and models the stick-breaking proportions 


{vi^t ■ t G T} using the positive correlated autoregressive beta process from McKenzie (1985). 


Nieto-Barajas et al. (2012) also use the common atoms simplification of the DDP, modeling 
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a time series of random distributions by linking the beta random variables through latent 
binomially distributed random variables. 

2.3 A Time-Dependent Nonparametric Prior 

To generate a correlated series {Pi^t '■ t G T} such that each /3; ^ ~ beta(a, 1) marginally, we 
define a stochastic process 

^ = |/3t = exp ^ ^ > (2) 

which is built from a standard normal random variable C and an independent stochastic pro¬ 
cess rjj- = {r]t : t G T} with standard normal marginal distributions. This transformation 
leads to marginal distributions f3t ~ beta(a, 1) for any t. To see this, take two independent 
standard normal random variables Yf and 12; such that W = {Y^ + follows an expo¬ 

nential distribution with mean 1, and thus B = exp(—IT/a) ~ beta(a, 1). To our knowledge, 
this is a novel construction for a common atoms DDP prior model. The practical utility 
of the transformation in (§ is that it facilitates building the temporal dependence through 
Gaussian time-series models, while maintaining the DP structure marginally. 

Because we work with distributions indexed in discrete time, we assume rjj- to be a 
first-order autoregressive (AR) process, however alternatives such as higher order processes 
or Gaussian processes for spatially indexed data are possible. The requirement of standard 
normal marginal distributions on rjj- leads to a restriction on the variance of the AR(1) model, 
such that r]i^t ~ t = 2,...,T. Thus \4>\ < 1, which implies stationarity 

for the stochastic process rj'f. Since is a transformation of a strongly stationary stochastic 
process, it is also strongly stationary. Note that the correlation in Pi,t+k) is driven by 
the autocorrelation present in r/y-, and this induces dependence in the weights {pi,t,Pi,t+k), 
which leads to dependent distributions {Gt,Gt+k)- 

We now explore this dependence, discussing some of the correlations implied by this prior 
model. First, consider the correlation of the beta random variables used to define the dynamic 
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stick-breaking weights. Let pk = coix{rit,pt+k), which is equal to (f)^ under the assumption of 
an AR(1) process for r/ 7 -. The autocorrelation function associated with B is 


corj:{/3t, Pt+k \ a A) 


-h l)^(a -k 2)^/^ 


{{1- pl + ay -Apl} 


1/2 


— cx(^cx 2 ) 


(3) 


as described in Appendix]^ Smaller values for a lead to smaller correlations for any fixed </> at 
a particular lag, and cj) controls the strength of correlation, with large (p producing large corre¬ 
lations which decay slowly. The parameters cp and a in combination can lead to a wide range 
of correlations, however a > 1 implies a lower bound near 0.5 on the correlation for any lag k. 
In the limit, as a —)■ O'*', coii{f3tAt+k \ ctA) ^ 0; and as a — 00 , coir {A, f3t+k \ C(,<P) tends 
towards 0.5 as pk —>■ 0"^, and 1 as —)> 1“. Assuming pk = (p^, gives lim^^]^- coTv{[3t, Pt+k \ 
«,(/)) = 1 and lim^^o+ con {A, Pt+k \ ct A) = + 1)(q; -|- 2)^/^ — a{a + 2). This tends 

upwards to 0.5 quickly as a —)■ 00 . 

Note that corr(/3i, pt+k \ ct, cp) is a function of but not pfc, which is to be expected since 
pt enters the expression for Pt only through and thus —pk and pk have the same effect 
in the correlation. The same is true of the correlation in the DP weights, which is given 
below. We believe that cp G (0,1) is a natural restriction, since we are building a stochastic 
process for distributions correlated in time through a transformation of an AR process, which 
intuition suggests should be positively correlated. However, all that is strictly required to 
preserve the DP marginals is \<p\ < 1. 

Assuming Piq- = {/3; t : t G T} is generated by B, from an underlying AR(1) process 
for pj- with coefficient </>, we study the dependence induced in the resulting DDP weights at 
consecutive time points, The covariance is given by 


cov{ppt,Pi,t+i \aA) = 


«3/2(i _ ^2)1/2 


i-i 


(2 -k {{1 — A + a)2 — 


2a 

1-r + 


„3/2(i_ ^2^1/2 


a 


21-2 


« + l (2-ka)V2{(l_</,2 + Q;)2_Q-2(^2}l/2 j (l + a)2' 


, (4) 


which can be divided by var(p;^f | a) = {2a ^/((l-k a)( 2 -k a) )} — {a^ ^/(l-ka)^ } to yield 
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coTv{pi^t,Pi,t+i I note that E{pi^t \ a) = E{pi^t+i \ a) and yaT{pi,t \ a) = Yav{pi^t+i \ a). 
Derivations are given in Appendix This expression is decreasing in index I, and larger 
values of cf) lead to larger correlations in the weights at any particular 1. Moreover, the 
decay in correlations with weight index is faster for small a and small (j). As a —)• 0^, 
coiT{pi^t,Pi,t+i I CK)*/') 1 for any value of 4>, and as a —)• oo, cor:r{pi^t,pi^t+i \ Ci,4>) is 

contained in (0.5,1), with values closer to 1 for larger (f). Note that con:{pi^t,Pi,t+k I cx, 4>) has 
the same expression as coTT{pi^t,Pi,t+i \ but with cj) replaced by 0^; it is thus decreasing 

with the lag k, with the speed of decay controlled by </>. 

Finally, assume Gt is a random distribution on M^, such that Gt = Yl'i^iPi,t^0v where 
the {pi^t ■ t £ T} are defined through the dynamic stick-breaking weights {/3i^t ■ t G T}, 
and the 6i are i.i.d. from a distribution Go on Consider two consecutive distributions 
{Gt,Gt+i), and a measurable subset A C . The correlation of consecutive distributions, 
corv{Gt{A),Gt+i{A) \ (j),a,Go), is discussed in Appendix [A} 

3 Estimating Maturity of Rockfish 

3.1 Chilipepper Rockfish Data 

We now utilize the method for incorporating dependence into the weights of the DP and 
the approach to ordinal regression involving DP mixtures of normals for the latent response- 
covariate distribution to further develop the DDP mixture modeling framework for dynamic 
ordinal regression. 

In the original rockfish data source, maturity is recorded on an ordinal scale from 1 
to 6, representing immature (1), early and late vitellogenesis (2,3), eyed larvae (4), and 
post-spawning (5,6). Because scientists are not necessarily interested in differentiating be¬ 
tween every one of these maturity levels, and to make the model output simple and more 
interpretable, we collapse maturity into three ordinal levels, representing immature (1), pre¬ 
spawning mature (2,3,4), and post-spawning mature (5,6). 

Many observations have age missing or maturity recorded as unknown. Exploratory 
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analysis suggests there to be no systematic pattern in missingness. Further discussion with 
fisheries research scientists having expertise in aging of rockfish and data collection revealed 
that the reason for missing age in a sample is that otoliths (ear stones used in aging) were 
not collected or have not yet been aged. Maturity may be recorded as unknown because 
it can be difficult to distinguish between stages, and samplers are told to record unknown 
unless they are reasonably sure of the stage. Therefore, there is no systematic reason that 
age or maturity is not present, and it is reasonable to assume that the data are missing at 
random, or that the probability an observation is missing does not depend on the missing 
values, allowing us to ignore the missing data mechanism, and base inferences only on the 
complete data (e.g., Rubin, 1976 Gehnan et ah, 2004). 

Age can not be treated as a continuous covariate, as there are approximately 25 distinct 
values of age in over 2, 200 observations. Age is in fact an ordinal random variable, such that 
a recorded age j implies the fish was between j and j + 1 years of age. This relationship 
between discrete recorded age and continuous age is obtained by the following reasoning. 
Chilipepper rockfish are winter spawning, and the young are assumed to be born in early 
January. The annuli (rings) of the otiliths are connted in order to determine age, and these 
also form sometime around Jannary. Thus, for each ring, there has been one year of growth. 

We therefore treat age much in the same way as maturity, using a latent continuous 
age variable. Let U represent observed ordinal age, let U* represent underlying continuous 
age, and assume, for j = 1,2,..., that U = j iS U* G (j,j + 1]. Equivalently, U = j iff 
log{U*) G (log(j), log(j + 1)], for j = 1,2,..., and t/ = 0 iff log(f7*) G (—oo,0], so that the 
support of the latent continuous random variable corresponding to age is M. Letting W be 
the latent continuous random variable which determines U through discretization, we assnme 
ut,i = j iff wt^i G (log(j), log(j + 1)], for j = 0,1,..., so that W is interpretable as log-age on 
a continuous scale. 

Considering year of sampling as the index of dependence, observations occur in years 1993 
through 2007, indexed by t = 1,..., T = 15, with no observations in 2003, 2005, or 2006. Let 
the missing years be given by s, here s = {11,13,14}, and let = {1,..., Tj \ s represent 


Rubin, 1976 Gehnan et ah, 2004 
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all other years in {1,..., T}. This situation involving time points in which data is completely 
missing is not uncommon in these types of problems, and can be handled with our model 
for equally spaced time points. We retain the ability to provide inference and estimation 
in years for which no data was recorded, which we will see are reasonable and exhibit more 
uncertainty than in other years. 


3.2 Hierarchical Model and Implementation Details 


The common atoms DDP model presented in Section 2.3 was tested extensively on simulated 
data, and performed very well in capturing trends in the underlying distributions when there 
were no missing years or forecasting was not the focus. In these settings, the common atoms 
model is sufficiently powerful from an inferential perspective such that the need to turn to a 
more complex model is diminished. However, as a consequence of the need to force the same 
set of components to be present at each time point, density estimates at missing time points 
tend to resemble an average across all time points, which is not desirable when a trend or 
change in support is suggested. A more general model is required for this application, since 
it is important to make inferences in years for which data was not collected. We therefore 
consider a simple extension, adding dependence through a vector autoregressive model in the 
mean component of the DP atoms, such that 6i = (/X;, S;) becomes Oi^t = 

Assuming Z represents maturity T on a continuous scale, W is interpretable as log- 
age, and X represents length, a dependent nonparametric mixture model is applied to es¬ 
timate the time-dependent distributions of the trivariate continuous random vectors = 
{Zti, Wti, Xti), for t = 1,..., T, and i = 1,..., n^. In our notation, T is the number of years 
or the final year containing data (and it is possible that not all years in {1,... ,T} contain 
observations), and n* is the sample size in year t. 

We utilize the computationally efficient approach to inference which involves truncating 
the countable representation for each Gt to a finite level N ( Ishwaran and James[ 2001), such 
that the dependent stick-breaking weights are given hy pi^t = 1— Pi,t = (1—A,*) n!-=i 


for r = 1,..., A" — 1, and pN,t = ensuring Xlzli Pi,t = 1- Since a is not a function of 
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t, the same truncation level is applied for all mixing distributions. In choosing the truncation 
level, we use the expression relating N to the expectation of the sum of the first N weights 
rci,..., wn generated from stick-breaking of beta(l, a) random variables. The expression is 
E(^^i Wj \ a) = 1 — {a/{a + 1))'^, which can be further averaged over the prior for a to 
obtain tCj), with N chosen such that this expectation is close to 1 up to the desired 

level of tolerance for the approximation. The hierarchical model can be expressed as follows: 

yt,i = 3 ^ 7j-i < Zt,i < 7 j, t G s'", i = 1,..., m 
Ut,i = j ^ log(j) < Wt,i < log(j + 1), t e i = I,... ,nt 

nt 

{yU I ~ H 

t£s<: i=l 

nt N 

t&s‘^ i=l l=l 

0~N(0,1), / = 1,...,A^-1 

r?z,i ~ N(0,1), l = 

m,t I ~ 1 -</’^)> / = 1,...,iv-1, t = 2,...,r 

I mo, Vb ~ N(mo, Vb), / = l,...,iV 
yi,t I N(m-F0/X;_t_i, V), / = 1,..., iV, t = 2,..., T 

1 = 1,...,N (5) 

with priors on a, ip = {m,V,D), p, and 0. Recall that is defined through 
and the {f3i^t} determine the {pi^t} through stick-breaking. 

The parameters {Q} and {rji^t} can be updated individually with slice samplers, which 
involves alternating simulation from uniform random variables and truncated normal ran¬ 
dom variables, implying draws for and hence {pi^t}- The configuration variables 

Lt^i are drawn from discrete distributions on {1,..., A^}, with probabilities proportional to 
yip ’^i) for / = 1,..., IV. The update for S; is IW(i/ + Mi,D + J2{it,iy.Lu=i}(yt,i ~ 
yit){yti ~ yit)'^)^ where Mi = |{t, i} : Lt^i = l\, and each is updated from a normal 
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distribution. The parameters a and (j), given priors lG{aa,ba), and uniform on (0,1) or 
(—1,1), respectively, can be sampled using Metropolis-Hastings steps. We assume that 0 
is diagonal, with elements (91,62,0^), however we advocate for a full covariance matrix V. 
This implies that each element of /X; ^ has a mean which depends only on the corresponding 
element of however there exists dependence in the elements of /X; j, which seems rea¬ 

sonable for most applications. We assume uniform priors on (0,1) or (—1,1) for each element 
of 0, and update them with a Metropolis-Hastings step. Finally, the parameters ijy have 
closed-form full conditional distributions, given priors m ~ N{am, Bm), V ~ IW(av,.Bv), 
D W{ao, Bd). The full conditionals and posterior simulation details are further described 
in Appendix [B| 

To implement the model, we must specify the parameters of the hyperpriors on -0. A 
default specification strategy is developed by considering the limiting case of the model as 
a — )> O'*" and 0 — )• 0, which results in a single normal distribution for Y^. In the limit, with 
I /Xt,S ~ N(/Xi,5]) and \ m,V ^ N{m,V), we find E(l"i*) = and Cov{Y*^) = 
Bm -|- By{av — d—1) ^ a^Boii' — d—1) where d is the response-covariate dimension, 
here d = 3. The only covariate information we require is an approximate center (such as 
the midpoint of the data) and range, denoted by and for X, and analogously for 
U. We use cf and r^/4 as proxies for the marginal mean and standard deviation of X. 
We also seek to scale the latent variables appropriately. The centers and ranges c“ and 
provide approximate centers c'^ and ranges r"' of latent log-age W. Since Y is supported on 
{1,..., C}, latent continuous Z must be supported on values slightly below 71 up to slightly 
above 7 c-i) so that r^/4 is a proxy for the standard deviation of Z, where = ( 7 C -1 — 7 i)- 
Using these mean and variance proxies, we fix am = (0, c“,c^). Each of the three terms 
in Cov(l^j) can be assigned an equal part of the total covariance, for instance being set to 
3“^diag{(r^/4)^, (r^'/d)^, (r*/4)^}. For dispersed but proper priors, u, ay and an can be 
fixed to small values such as d + 2, and Bm, By, and Bo determined accordingly. 

It remains to specify mo and Vq, the mean and covariance for the initial distributions 
Hi i- We propose a fairly conservative specihcation, noting that in the limit, E(l^^) = 
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mo, and CovCK^) = a^Boiv — d — 1)“^ + ^o- Therefore, mg can be specified in the 
same way as a^n but using only the subset of data at t = 1, and Vq can be set to 
diag{(r^/4)^, (r5"/4)^, (rf/4)^} — aoBoii^ — d — 1)“^, where the subscript 1 indicates the 
subset of data at t = 1. 

In simulation studies and the rockfish application, we observed a moderate to large amount 
of learning for all hyperparameters. For instance, for the rockfish data, the posterior distri¬ 
bution for (j) was concentrated on values close to 1, indicating the DDP weights are strongly 
correlated across time. There was also moderate learning for a as its posterior distribution 
was concentrated around 0.5, with small variance, shifted down relative to the prior which 
had expectation 2. The posterior distribution for each element of m was reduced in variance 
and concentrated on values not far from those indicated by the prior mean. The posterior 
samples for the covariance matrices V and D supported smaller variance components than 
suggested by the prior. 


3.3 Results 


Various simulation settings were developed to study both the common atoms version of the 


model and the more general version (DeYoreo, 2014, chapter 4). While we focus only on the 
fish maturity data application in this paper, our extensive simulation studies have revealed 
the inferential power of the model under different scenarios for the true latent response 
distribution and ordinal regression relationships. 

We first discuss inference results for quantities not involving maturity. As illustrated with 
results for six years in Figure the estimates for the density of length display a range of 
shapes. The interval estimates reflect the different sample sizes in these years; for instance, 
in 1994 there are 271 observations, whereas in 2000 and 2004 there are only 64 and 37 
observations, respectively. A feature of our modeling approach is that inference for the 
density of age can be obtained over a continuous scale. The corresponding estimates are 
shown in Figure for the same years as length. 

The posterior mean surfaces for the bivariate density of age and length are shown in 
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Figure 3: Posterior mean and 95% interval estimates for the density of length (in millimeters) 
across six years, with the data shown as a histogram. 


t=1994 t=1996 t=1998 



0 5 10 15 20 0 5 10 15 20 0 5 10 15 20 

age age age 


Figure 4: Posterior mean and 95% interval estimates for the density of age on a continuous 
scale across six years, with the data shown as a histogram. 
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Figure 5: Posterior mean estimates for the bivariate density of age and length across all years. 
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Figure 6: Posterior mean and 95% interval bands for the expected value of length over 
(continuous) age, E(X \ U* = u*; Gt), across three years. Overlaid are the data (in blue) and 
the estimated von Bertalanffy growth curves (in red). 


Figure]^ for all years, including the ones (years 2003, 2005 and 2006) for which data is not 
available. The model yields more smooth shapes for the density estimates in these years. 
An ellipse with a slight “banana” shape appears at each year, though some nonstandard 
features and differences across years are present. In particular, the density in year 2002 
extends down farther to smaller ages and lengths; this year is unique in that it contains a 
very large proportion of the young hsh which are present in the data. One can envision 
a curve going through the center of these densities, representing E(X | U* = u*;Gt), for 
which we show posterior mean and 95% interval bands for three years in Figure The 
estimates from our model are compared with the von Bertalanffy growth curves for length- 
at-age, which are based on a particular function of age and three parameters (estimated here 
using nonlinear least squares). It is noteworthy that the nonparametric mixture model for the 
joint distribution of length and age yields estimated growth curves which are overall similar 
to the von Bertalanffy parametric model, with some local differences especially in year 2002. 
The uncertainty quantification in the growth curves afforded by the nonparametric model is 
important, since the attainment of unique growth curves by group (i.e. by location or cohort) 
is often used to suggest that the groups differ in some way, and this type of analysis should 
clearly take into account the uncertainty in the estimated curves. 
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The last year 2007, in addition to containing few observations, is peculiar. There are no 
fish that are younger than age 6 in this year, and most of the age 6 and 7 fish are recorded 
as immature, even though in all years combined, less than 10% of age 6 as well as age 7 fish 
are immature. This year appears to be an anomaly. As there are no observations in 2005 or 
2006, and a small number of observations in 2007 which seem to contradict the other years 
of data, hereinafter, we report inferences only up to 2004. 

Inference for the maturation probability curves is shown over length and age in Figures 
and|^ The probability that a fish is immature is generally decreasing over length, reaching a 
value near 0 at around 350 mm in most years. There is a large change in this probability over 
length in 2002 and 2004 as compared to other years, as these years suggest a probability close 
to 1 for very small fish near 200 to 250 mm. Turning to age, the probability of immaturity 
is also decreasing with age, also showing differences in 2002 and 2004 in comparison to other 
years. There is no clear indication of a general trend in the probabilities associated with 
levels 2 or 3. Years 1995-1997 and 1999 display similar behavior, with a peak in probability 
of post-spawning mature for moderate length values near 350 mm, and ages 6-7, favoring 
pre-spawning mature fish at other lengths and ages. The last four years 2001-2004 suggest 
the probability of pre-spawning mature to be increasing with length up to a point and then 
leveling off, while post-spawning is favored most for large fish. Post-spawning appears to have 
a lower probability than pre-spawning mature for any age at all years, with the exception of 
1998, for which the probability associated with post-spawning is very high for older fish. 

The Pacific States Marine Fisheries Commission states that all Chilipepper rockfish are 
mature at around 4-5 years, and at size 304 to 330 mm. A stock assessment produced by 


the Pacific Fishery Management Council (Field 2009) fitted a logistic regression to model 
maturity over length, from which it appears that 90% of fish are mature around 300-350 
mm. As our model does not enforce monotonicity on the probability of maturity across age, 
we obtain posterior distributions for the first age greater than 2 (since biologically all fish 
under 2 should be immature) at which the probability of maturity exceeds 90%, given that it 
exceeds 90% at some point. That is, for each posterior sample we evaluate Pr(Y > 1 | u*; Gt) 
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Figure 7: Posterior mean (black lines) and 95% interval estimates (gray shaded regions) 
for the marginal ordinal probability curves associated with length. Category 1 (immature) 
given by solid line, category 2 (pre-spawning mature) given by dashed line, and category 3 
(post-spawning mature) shown as a dotted line. 
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Figure 8: Posterior mean (black lines) and 95% interval estimates (gray shaded regions) for 
the marginal ordinal probability curves associated with age. Category 1 (immature) given 
by solid line, category 2 (pre-spawning mature) given by dashed line, and category 3 (post¬ 
spawning mature) shown as a dotted line. 
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Figure 9: Posterior mean and 90% intervals for the smallest value of age above 2 years at 
which probability of maturity first exceeds 90% (left), and similar inference for length (right). 


over a grid in u* beginning at 2, and find the smallest value of u* at which this probability 
exceeds 0.9. Note that there were very few posterior samples for which this probability did 
not exceed 0.9 for any age (only 4 samples in 1993 and 8 in 2003). The estimates for age at 
90% maturity are shown in the left panel of Figure]^ The model uncovers a (weak) U-shaped 
trend across years. Also noteworthy are the very narrow interval bands in 2002. Recall that 
this year contains an abnormally large number of young fish. In this year, over half of fish 
age 2 (that is, of age 2-3) are immature, and over 90% of age 3 (that is, of age 3-4) fish are 
mature, so we would expect the age at 90% maturity to be above 3 but less than 4, which 
our estimate confirms. A similar analysis is performed for length (right panel of Figure 
suggesting a trend over time which is consistent with the age analysis. 

Due to the monotonicity in the maturity probability curve in standard approaches, and 
the fact that age and length are treated as fixed, the point at which maturity exceeds a 
certain probability is a reasonable quantity to obtain in order to study the age or length at 
which most fish are mature. However, since we are modeling age and length, we can obtain 
their entire distribution at a given maturity level. These are inverse inferences, in which we 
study, for instance, f{x \ Y = l;Gt) as opposed to Pr(y = 1 | x-,Gt). It is most informative 
to look at age and length for immature fish, as this makes it clear at which age or length 
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there is essentially no probability assigned to the immature category. The posterior mean 


estimates for f{u*,x \ Y = Y,Gt) are shown in Figure 10 


3.4 Model Checking 

Here, we discuss results from posterior predictive model checking. In particular, we generated 
replicate data sets from the posterior predictive distribution, and compared to the real data 


using specific test quantities (Gelman et ah, 2004). Using the MCMC output, we simulate 


replicate data sets of the same size as the original data. We then choose 

some test quantity, T{{yti^wti-,xti})-, and for each replicate data set at time t, determine 
the value of the test quantity and compare the distribution of test quantities with the value 
computed from the actual data. To obtain Figure [TT| we computed, for each replicate sample, 
the proportion of age 6 fish that were of maturity levels 1 and 2. Boxplots of these proportions 
are shown, with the actual proportions from the data indicated as blue points. The width of 
each box is proportional to the number of age 6 fish in that year. Figure [l2] refers to fish of at 
least age 7 with length larger than 400 mm. Finally, Figure [T^ plots the sample correlation 
of length and age for fish of maturity level 2. The results reported in Figures [TH [T^ and 13 
suggest that the model is predicting data which is very similar to the observed data in terms 
of practically important inferences. 

Although results are not shown here, we also studied residuals with cross-validation, ran¬ 
domly selecting 20% of the observations in each year and rehtting the model, leaving out 
these observations. We obtained residuals yu — E(y | W = wti,X = xti',Gt) for each obser¬ 
vation {yti,wu,xu) which was left out. There was no apparent trend in the residuals across 
covariate values, that is, no indication that we are systematically under or overestimating 
fish maturity of a particular length and/or age. 
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Figure 10; Posterior mean estimate for the bivariate density of age and length for immature 
fish across years. The data on age and length of immature fish is overlaid in each plot. 
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Figure 11: Distributions of the proportion of age 6 fish that were of maturity level 1 (left 
panel) and 2 (right panel) in the replicated data sets are shown as boxplots, with width 
proportional to the number of age 6 fish in each year. The actual proportion from the data 
is given as a blue circle. 
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Figure 12: Distributions of the proportion of fish age 7 and above and length larger than 400 
mm that were of maturity level 1 (left panel) and 2 (right panel) in the replicated data sets 
are shown as boxplots, with width proportional to the number fish of this age and length in 
each year. The actual proportion from the data is given as a blue circle. 
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corr(u,x); y=2 



Figure 13: Distributions of the sample correlation between length and age for fish that were 
of maturity level 2 in the replicated data sets are shown as boxplots, with width proportional 
to the number of level 2 fish in each year. The sample correlation based on the data is given 
as a blue circle. 

4 Discussion 

The methods developed for dynamic ordinal regression are widely applicable to modeling 
mixed ordinal-continuous distributions indexed in discrete time. At any particular point 
in time, the DP mixture representation for the latent response-covariate distribution is re¬ 
tained, enabling flexible inference for a variety of functionals, and allowing standard posterior 
simulation techniques for DP mixture models to be utilized. 

In contrast to standard approaches to ordinal regression, the model does not force spe¬ 
cific trends, such as monotonicity, in the regression functions. We view this as an attribute 
in most settings. Nevertheless, in situations in which it is believed that monotonicity ex¬ 
ists, we must realize that the data will determine the model output, and may not produce 
strictly monotonic relationships. In the fish maturity example, it is generally accepted that 
monotonicity exists in the relationship between maturity and age or length. Although our 
model does not enforce this, the inferences generally agree with what is expected to be true 
biologically. Our model is also extremely relevant to this setting, as the covariates age and 
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length are treated as random, and the ordinal nature of recorded age is accounted for using 
variables which represent underlying continuous age. The set of inferences that are provided 
under this framework, including estimates for length as a function of age, make this modeling 
approach powerful for the particular application considered, as well as related problems. 

While year of sampling was considered to be the index of dependence in this analysis, an 
alternative is to consider cohort as an index of dependence. All fish born in the same year, or 
the same age in a given year, represent one cohort. Grouping fish by cohort rather than year of 
record should lead to more homogeneity within a group, however there are also some possible 
issues since fish will generally be younger as cohort index increases. This is a consequence 
of having a particular set of years for which data is collected, i.e., the cohort of fish born in 
2006 can not be older than 4 if data collection stopped in 2009. Due to complications such 
as these, combined with exploration of the relationships within each cohort, we decided to 
treat year of data collection as the index of dependence, but cohort indexing could be more 
appropriate in other analyses of similar data structures. 

The proposed modeling approach could also be useful in applications in finance. One 
such example arises in the analysis of price changes of stocks. In the past, stocks traded 
on the New York Stock Exchange were priced in eighths, later moved to sixteenths, and 
corporate bonds still trade in eighths. In analyzing price changes of stocks which are traded 
in fractions, it is inappropriate to treat the measurements as continuous, particularly if the 


range of values is not very large (e.g., Muller and Czado, 2009). The price changes should 
be treated with a discrete response model, and the possible responses are ordered, ranging 
from a large negative return to a large positive return. One possible analysis may involve 
modeling the monthly returns as a function of covariates such as trade volume, and must take 
into account the ordinal nature of the responses. In addition, the distribution of returns in a 
particular month is likely correlated with the previous month, and the regression relationships 
must be allowed to be related from one month to the next. In finance as well as environmental 
science, empirical distributions may exhibit non-standard features which require more general 
methods, such as the nonparametric mixture model developed here. 
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A Properties of the DDP Prior Model 

Here, we provide derivations of the various correlations associated with the DDP prior, as 
given in Section |2.3[ 
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Autocorrelation of (/3t,/3t+fc) 

Since the process is stationary with j3t ~ beta(a, 1) at any time t, E(/3t \ a) = a/{a + 1) and 
var(/3t I a) = a/{{a + l)^(a + 2)}. We also have 

E(/3tA+fc I = E{exp(-CV«)}E{exp(-(r/i + /2a)} (A.l) 

using the definition of the B process in Q. The first expectation can be obtained through 
the moment generating function of ~ yf, which is given by E(e*‘’ ) = (1 — for 

t < 1/2. Hence, for t = —l/a, we obtain E {exp(—C^/a)} = Q!^/^/(2 + Regarding 

the second expectation, note that {T]t,'i]t+k) ~ ^(0,(7^), with Ck a covariance matrix with 
diagonal elements equal to 1 and off-diagonal elements equal to pk- Integration results in 
E {exp(-(r 7 / -|- r]/^i.)/2a)} = 0(1 - piy^‘^/{{l - Pk + The correlation in ^ 

results by combining the terms above with the expressions for E(/3t | a) and va,r{l3t \ a). 

Autocorrelation of DP weights 

Eirst, E{pi^t I a) = E{(1 - Pi^t) n!-=i f3r,t I a}- Since the j3pt are independent across /, and 
E(A,i I a) = a/{a + 1), we obtain E{pi^t \ a) = + a)K Similarly, E(p/^ | a) = 

— I a} n!-=i I = 2a*“^/{(a-|-l)(a-|-2)^}, from which var(p;^t | a) obtains. 

Since pptPi,t+i = (1 - A,t)(l - n!-=iand is independent of 

(/3m,t, /3m,i+i), for any / / m, we can write 


i-i 

E{pptPi,t+i !«,</) = E{(1 - A,0(1 “ Pi,t+i) I a, E(/3r,t/3r,t-|-l I «,</>)■ 

r=l 

The required expectations in the above equation can be obtained from (A.l) for k = 1, such 
that Pi = (/). Combining the above expressions yields the covariance in Q. 

Autocorrelation of consecutive distributions 

The DDP prior model implies at any t a DP(a, Gq) prior for Gt = Yll^iPi,t^0v where the 6 i 
are i.i.d. from a distribution Go on Hence, for any measurable subset A C M^, we have 
E(Gt(A) I a, Go) = E(Gi+i(A) | a, Go) = Go(A), and var(Gi(A) | a, Go) = var(Gt+i(A) | 
a. Go) = Go(A)(l-Go(A))/(a + l). 

The additional expectation needed in order to obtain corr(Gt(A), Gi+i(A) | (/>, a,Go) is 
E(Gt(A)Gt+i(A) I 0,a,Go) = E{{Y)/ZiPi,t^Oi{A)){Y)Z=iPrn.,t+iSern{^)) \ (/,a,Go], which 
can be written as 

)+^(e:x„ ^^^Pi,tPm.,t+i5ei{A)5e^{A)^ . (A.2) 

Eor the first expectation in (A.2), note that pi^tPi,t+i is independent of (de, (A))^, and E{pi^tPi,t+i 
a,4>) has been derived earlier. Moreover, since (Jg,(A))^ is equal to 1 if 0/ G A and 0 oth¬ 
erwise, E((<50j(A))^ I a. Go) = Go(A). Regarding the second expectation in (A.2), we have 
again that pptPm,t+i and 60 ^{A) 5 g^{A) are independent. Here, E{5g^{A)6g^{A) \ a. Go) = 
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(Go(j 4))^, since 6 i and 6 m are independent for m ^ 1. The final step of the derivation in¬ 
volves the expectations E{pi^tPm,t+i \ for m ^ 1. Note that, if Z < m, pi^tPm,t+i = 

(nt=i l^r,tPr,t+i) - /3i,t) (n^^ 7 +i Pr,t+i) (1 “ /3m,t+i), and an analogous expression can 

be written when m < 1. Therefore, for general m ^ I, pi^tPm,t+i can be expressed as a product 
of niax{Z,m} independent components, since each component comprises one or two of the 
random variables in {/3i^t} and {/3m,t+i}, in the latter case having the same first subscript. 
Thus, E{pi^tPm,t+i I «, (j)) can be developed through products of expectations of the form 
E(A,f I a) and \ which have been obtained earlier. 


B Posterior Simulation Details 

We derive the posterior full conditionals and provide updating strategies for many of the 
parameters of the hierarchical model in Section |3.2[ 


Updating the weights 

The full conditional for ({0}, {viA) is given by p({0}, I • • • > data) oc 

N-l T N-l T nt N 

n i)N(??u;o> 1) n n 1 - </>^) nnu Pi,A{Lt,i)- 

1=1 t=2 1=1 t=li=ll=l 

Write W'iLiYA=iPl,th{Lt,i) = W^=iP^i’\ where Mi^t =\ : Lt^i = 1} |, i.e., the number 

of observations at time t assigned to component 1. Filling in the form for {pi^t} gives 


nt N / 

n = 1 - exp 

i=l 1=1 V 


^2 I \ \ W.t 


Cl + vit 


MN,tY:i=-i\Qf+^it) 


2 a 


N-l 

n 

1=2 


1 — exp — 


Cl + vl 

2 a 


exp ^ 
\ 


2a 


Ml,tICiJliCr + Vr,t) 


exp 


2a 


The full conditional for each Q, I = 1,..., N — 1, is therefore 

>2 v^T 71^ \ T 


p{Ci I ..., data) oc exp ( -y ) exp 


giving 


-cf Ei=i Eih+i M, 


r,t 


2a 


nb 


exp 


i=l 


Cf + ril 

2 a 


M,, 


T N T / 

p{Ci\ data) oc N(0; 0, (1 + ^ ^ | 1 - exp 

t=l r=l+l 


t=l 


Cl + 

2 a 


We use a slice sampler to update Ch with the following steps: 

" W,A 


Draw Ut ~ uniform 0, I 1 — exp I — 


(f+^lt 


2a 


, for t = 1,... ,r. 
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• Draw Q r\j 


Cl-Ut < 


N(0, (1 + a ^ Ylt=i'l2^=i+i ^r,t) ^)) restricted to the lie in the interval 
1 — exp f — ^ ^ ,t = l,...,Tl. Solving for Cl in each of these 


T equations gives Cf > ~vft ~ 2alog(l — for t = Therefore, if 

—r]f^ — 2Q;log(l — < 0 for all t, then Cl has no restrictions, and is therefore 

sampled from a normal distribution. Otherwise, if —'r]f^ — 2alog(l — > 0 for 

some t, then | 0 l> niaxt{(——2alog(l —This then requires sampling Cl 
from a normal distribution, restricted to the intervals (—oo, — maxtii—ilft ~ 2alog(l — 
^yA^!,t))i/ 2 })^ and (maxt{(-r/y - 2alog(l - oo). 


In the second step above, we may have to sample from a normal distribution, restricted to two 
disjoint intervals. The resulting distribution is therefore a mixture of two truncated normals, 
with probabilities determined by the (normalized) probability the normal assigns to each in¬ 
terval. These truncated normals both have mean 0 and variance (l + Xl^i Yl^=i+i ^r,t/oi)~^■, 
and each mixture component has equal probability. 

The full conditional for each I = 1,..., N — 1, t = 2,... ,T — 1, is proportional to 




a 






Mr. 


N(r7/,t;0r?/.t_i,l-^^)N(?7/,t+i;0r?/,t,l-(/)^) 1 - exp - 



oc N 


4>a{r]i^t-i +r]i,t+i) 


0 ( 1 - 02 ) 


^^(c^ ~ X]r=/+1 ^r,t) + O + J2r=l+1 ^r,t — J2r=l+1 ^r,t) + O + X]r=; + 1 J 

( ( 

1 - exp - 








2 a 


Each I = 1, •.., — 1, and t = 2,..., T — 1, can therefore be sampled with a slice 

sampler: 


Draw u ~ Unif 0, ' 1 — exp — 


(f+vf. 


2a 


Ml-, 


Draw rii^t ~ N rn^t] 






P'^ia-Y.r^l + l Mr,t)+a+Y,r=l + l Mr,t ’ + i Mr,t) + a+Y,r^l + i Mr,t J ’ 


restricted to < rji^t ■ — exp ^ ^ i, giving > —2alog(l—— 


Cf- 


In the second step above, we will again either sample from a single normal or a mixture 
of truncated normals, where each normal has the same mean and variance, but the trun¬ 
cation intervals differ. Since the mean of this normal is not zero, the weights assigned to 
each truncated normal are not the same. The unnormalized weight assigned to the nor¬ 
mal which places positive probability on ((—2Q;log(l — — Cf)^^^;Co) is given by 
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1 —F((—2alog(l ——where F is the CDF of the normal for iji^t given in the sec¬ 
ond step. The unnormalized weight given to the component which places positive probability 
on (— 00 , —(— 2 alog(l — is given by F(—(— 2 alog(l — 

The full conditionals for r/; 1 and rji^x are slightly different. The full conditional for r]i^i is 

p(r7/,i I ...,data) (xN{r]ix,0, -)N(r7/ 1 ; 0,1)N(?7;,2; ( 1 - exp 

Er^l+lMr,! V 



ocN rjix, 


4>(X'ni,2 


a{l — (fP') 


^ + Z]r=z-|-1 ^r,l (fp Z]r=z-|-1 « + '^r=l+l ^r,l (jP Z]r=Z-|-l ^r,l ^ 


1 — exp — 


Cf + 

2 a 


Mi,i 


For r^ix, we have: 


a 


p(? 7 z,r I • • •, data) oc N( 77 z,r; 0, —^ 

l^r=l+l ^r,T 

which is proportional to 


)N(r//,T; 4>'ni,T-i, 1 - exp - 


Cf + vh 

2 a 




(t>0!r]ix-i 


a{l — (jP) 


« + E^Z-hl ^r,T - (fP Ylr=l+l ^r,T « + E^Z+1 ^r,T “ (jP Y.r=l+l ^r,T ^ 


1 — exp — 


<f+X \Y" 


The slice samplers for rji^i and rji^ are therefore implemented in the same way as for 77 /^t, 
except the normals which are sampled from have different means and variances. 


Updating a 

The full conditional for a is 




p{a I ..., data) cx p{a) exp — 


2 a 

\ 

T N-1 / / 

n J] ( 1 - exp ( - 

t=l 1=1 \ \ 

Therefore, with p(a) = IG(aQ,, ba), we have 


exp -- 


Ml,ErJiier+vit) 

2 a 


Cj + nt, 

2 a 


Ml-. 


N-1 


N-1 1-1 


p{a I ..., data) = IG a; Oq, 6q, - E Mn,! ^ {Cl + vf.t) + ^ Ml, ^(Cr + ^r,t 


t=l 


1=1 


1=2 


r=l 
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T N-1 


nn i-exp - 


t=i 1=1 


C? + vl 

2 a 


Ml, 


The parameter a can be sampled using a Metropolis-Hastings algorithm. In particular we 
work with log(a), and use a normal proposal distribution centered at the log of the current 
value of a. 


Updating cj) 

The full conditional for the AR parameter (j) is 

T 7V-1 

p{(f> I ..., data) oc p((/>) nn 

t=2 l=l 

oc (1 - ^ ^ 2(1 - ^ 2 ) p{(f) 

We assume p{4’) = uniform(0,1) or p{(p) = uniform(—1,1), and apply a Metropolis-Hastings 
algorithm to sample log log ’ ’^ospectively, using a normal proposal distribu¬ 

tion. 

Updating {/X; J 

The updates for pin are N(m*, V*), with m* and V* given by: 

• For t = 2,..., T—1, if Mi^t = 0, then the update for p; ^ has V* = 

and m* = V*{V~^{m -|- + (0“^U0“'^)“^0“^(/2; — m)) 

• For t = 2,..., T—1, if Mi^t / 0, then the update for p; ^ has V* = (U“^-|-(0“^U0“'^)“^-|- 

and m* = V*{V~^{m + - m) + 

Y2{i:Lt^i=l} yt,i) 

• for t = 1, if M; 1 = 0, then the update for ^ has V* = ((0“^U0“^)“^ -|- 

and m* = V* {pLi 2 — m) + Uq ^mo) 

• for t = 1, if / 0, then the update for pii ^ has V* = -|- (0“^U0“^)“^ -|- 

Ug and m* = T.{r.L,,i=i} Vpi + ( 0 “^'V^ 0 “^)“^ 0 “nR /,2 - m) + Ug ^mo) 

• for t = T, if Mi^t = 0, then the update for ha-s V* = V, and m* = m + 

• for t = T, if Mi^t / 0, then the update for has V* = -|- and 

m* = E{.l,,=4 yt,^ + v-\m + Qp.i,T-i)) 

Updating {S;} 

The posterior full conditional for T); is proportional to IW(j^ -|- Mi,D + Il{{t,iy.Lu=i}iyt,i - 
yi){yt,i ~ When : Lt^i = Z}| = 0, is drawn from IW{iy,D). 
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